Data definition

Data description

Organize dataset

Time series plot

Box plot

Yearly box plot

Monthly box plot

Monthly sales for Rose Wine across years for each month

Decompose the Time Series

Trend, Seasonality and Residual

Train test split & plot

Linear regression model – RMSE

We regress the "Rose" variable against its order of occurrence, i.e. a running time index. The training data must therefore be modified to include this time index as the predictor before the linear regression is fitted.
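As a minimal sketch of this step, the snippet below regresses a series on its order of occurrence and scores the test forecast with RMSE. The sales figures and the variable names (`train`, `test`, `train_time`, `test_time`) are hypothetical stand-ins, not values from the Rose dataset, and `numpy.polyfit` is used here only as one convenient way to fit the straight line.

```python
import numpy as np

# Hypothetical monthly sales standing in for the "Rose" train and test sets.
train = np.array([112.0, 118.0, 129.0, 99.0, 116.0, 168.0, 118.0, 129.0])
test = np.array([205.0, 147.0, 150.0, 267.0])

# Order of occurrence: 1..n for training, continuing n+1..n+m for the test set.
train_time = np.arange(1, len(train) + 1)
test_time = np.arange(len(train) + 1, len(train) + len(test) + 1)

# Ordinary least squares fit of: sales = intercept + slope * time.
slope, intercept = np.polyfit(train_time, train, deg=1)

# Forecast the test period and compute the root mean squared error.
test_pred = intercept + slope * test_time
rmse = float(np.sqrt(np.mean((test - test_pred) ** 2)))
print(f"slope={slope:.3f}, intercept={intercept:.3f}, test RMSE={rmse:.3f}")
```

The key point is that the test time index continues where the training index stops, so the fitted line is extrapolated rather than refitted.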

Plot

Accuracy Metrics and Model Evaluation

Model Evaluation

Naïve model – RMSE

Model Evaluation

Simple average model – RMSE

Model Evaluation

Moving average model – RMSE

For the moving average model, we calculate rolling means (moving averages) over several window lengths. The best window is the one that gives the highest accuracy, i.e. the lowest error.

The moving averages are computed over the entire dataset.
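The window comparison described above could be sketched as follows. The series, its dates, the candidate windows (2, 4, 6) and the test size `n_test` are all assumptions made for illustration. Note that a trailing mean includes the current observation, so it acts as a smoother and short windows tend to track the data more closely.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series; the real data would be the "Rose" column.
sales = pd.Series(
    [112.0, 118.0, 129.0, 99.0, 116.0, 168.0,
     118.0, 129.0, 205.0, 147.0, 150.0, 267.0],
    index=pd.date_range("1980-01-01", periods=12, freq="MS"),
)

n_test = 4  # assumed size of the test portion at the end of the series
rmse_by_window = {}
for window in (2, 4, 6):
    # Trailing mean over the entire series: each value averages the current
    # observation and the (window - 1) observations before it.
    trailing = sales.rolling(window).mean()
    actual = sales.iloc[-n_test:]
    fitted = trailing.iloc[-n_test:]
    rmse_by_window[window] = float(np.sqrt(((actual - fitted) ** 2).mean()))

# The preferred window is the one with the lowest test RMSE.
best_window = min(rmse_by_window, key=rmse_by_window.get)
print(rmse_by_window, best_window)
```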

Trailing moving averages

Plot

Split the data into train and test and plot

Model Evaluation

Plot of all models derived till now

Simple exponential smoothing – RMSE analysis

Model Evaluation

Set different alpha values

Plot

Double exponential smoothing (Holt’s method) – RMSE analysis

Triple exponential smoothing (Holt-Winters model) – RMSE analysis

RMSE

Identify best alpha, beta and gamma

Plot SES, DES and TES

Training TES model with full data

Margin of Error

Stationarity check with the Augmented Dickey-Fuller (adfuller) test

Build ARIMA model with lowest AIC score – test this model on test data using RMSE

Prediction

Manual ARIMA using ACF and PACF plots

Build SARIMA model with lowest AIC score – test this model on test data using RMSE

Prediction on the Test Set and Evaluation

SARIMA with seasonality 12

Predict on the Test Set and Evaluation

Manual SARIMA model – best parameters selected from ACF and PACF plots – seasonality 6

The data may still show a slight trend, so we take a first-order difference of the seasonally differenced series.
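To illustrate the two differencing steps, here is a small sketch on a synthetic series, assuming a seasonal period of 6. The series (a quadratic trend plus a pattern repeating every 6 periods) is made up for the example and is not the Rose data. It is chosen so that seasonal differencing alone leaves a trend behind.

```python
import pandas as pd

# Hypothetical series: quadratic trend plus a pattern repeating every 6 periods.
pattern = [10.0, -5.0, 0.0, 5.0, -10.0, 0.0]
values = [0.5 * t * t + pattern[t % 6] for t in range(24)]
series = pd.Series(values)

# Seasonal differencing at lag 6 removes the repeating pattern,
# but a linear trend remains in the result.
seasonal_diff = series.diff(6).dropna()

# First-order differencing of the seasonally differenced series
# removes that remaining trend.
both_diff = seasonal_diff.diff().dropna()

print(seasonal_diff.head())
print(both_diff.head())
```

On real data the result will not be constant as it is here, which is why the differenced series is then checked for stationarity.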

Check the stationarity of this differenced series before fitting the SARIMA model.

Check the ACF and PACF plots of the modified time series.

Here, we have taken alpha=0.05.

We take the seasonal period as 6 and keep the p(0) and q(0) parameters the same as in the ARIMA model.

The seasonal auto-regressive parameter in a SARIMA model is 'P', taken from the significant seasonal lag after which the PACF plot cuts off; here it is 2. The seasonal moving-average parameter is 'Q', taken from the significant seasonal lag after which the ACF plot cuts off; here it is also 2.

Prediction on the Test Set and Evaluation

Build table with all the above models with RMSE scores

Build the optimal model on the full data

Use the optimal model (lowest RMSE) to predict 12 months into the future, with a plot and confidence intervals